ABSTRACT

Analyses of stellar spectra often begin with the determination of a number of 
parameters that define a model atmosphere. This work presents a prototype for an 
automated spectral classification system that uses a 150 Å wide region around Hβ,
and applies to stars of spectral types A to K with normal (scaled solar) chemical com- 
position. The new tool exploits synthetic spectra based on plane-parallel flux-constant 
model atmospheres. The input data are high signal-to-noise spectra with a resolution 
greater than about 1 Å. The output parameters are forced to agree with an external
scale of effective temperatures, based on the Infrared Flux Method. The system is 
fast (a spectrum is classified in a few seconds) and well suited for implementation
on a web server. We estimate upper limits to the 1σ random error in the retrieved
effective temperatures, surface gravities, and metallicities as 100 K, 0.3 dex, and 0.1 
dex, respectively. 
1 INTRODUCTION

Mass, radius, and luminosity are some of the most inter- 
esting properties of a star. Unfortunately, it is non-linear 
combinations of them that produce quasi-linear changes on 
a stellar spectrum. Stellar fluxes are commonly interpreted 
in terms of atmospheric temperature, pressure, and chem- 
ical composition. In the context of classical flux-constant 
model atmospheres, these fields are simply specified with 
three scalars: the effective temperature, the surface gravity, 
and a solar-scaled metallicity. Nevertheless, extracting the 
three from an observed spectrum is rarely trivial. 

A spectroscopic classification system has been devel- 
oped independently of the physical parameters. The MK 
system (Morgan, Keenan & Kellman 1943) lays out a series 
of rules to assign spectral classes from medium-low reso- 
lution spectra. This method has the advantage of provid- 
ing a standard reference independent of models. However, 
it is somewhat artificial, in the sense that the defined spec- 
tral classes are obviously correlated with the relevant at- 
mospheric parameters. Moreover, the classical MK system 
does not provide for metal-poor stars. One may note, as 
an example, that the metal-poor giant HD 122563, which 
has an effective temperature around 4600 K, has been of- 
ten classified as a late-F or early-G type star. On the other 
hand, classification methods based on physical parameters 
are more natural, but model-dependent. 
Both the MK and the physical classification systems
have their own advantages, and this may be the reason why
they still coexist. However, because astronomical applications
ultimately demand the physical parameters, most recent
research is shifting toward the direct
extraction of those quantities. As recognized by many spe- 
cialists, repeatability and high speed in spectroscopic stellar 
classification can only be achieved by using automatic meth- 
ods. A recent discussion of the most used methods can be 
found in Bailer-Jones (2001).

Derived stellar parameters may differ when deter- 
mined from different wavelength ranges or spectral features. 
Among other reasons, this may be caused by using models 
that are too simplistic. As the parameters will be derived 
from their expected effect on the spectra, inaccurate predic- 
tions, or neglect of other relevant parameters, will bias the 
results. Details in the implementation of a classification pro- 
cedure are also a reason to worry. The wide range of effective 
temperatures that have been assigned to the metal-poor sub- 
giant HD 140283 in the recent literature serves as testimony 
of the respectable uncertainties still involved in the scale of 
effective temperatures (see, e.g., the discussion in Snider et 
al. 2001). 

Discrepancies produced by interpreting stellar spectra 
with model atmospheres that are too simple will decrease 
as progress in theory takes place. For main-sequence stars, 
remarkable advances are happening as efforts focus on re- 
laxing the assumptions of Local Thermodynamic Equilib- 
rium (LTE) and hydrostatic equilibrium. Systematic differ- 



2 C. Allende Prieto 



ences that arise in the implementation of different methods 
for spectroscopic classification can be controlled by estab- 
lishing standards. A reference implementation would be far 
more powerful than a set of standard stars. Modern infor- 
mation technologies have paved the way for implementing 
a public automatic classification system accessible over the 
Internet. Such an open system, if reliable and fast, could serve
as a standard reference. 

This paper discusses the first steps toward the imple- 
mentation of a prototype for an open classification system. 
This work will only deal with a section of the HR diagram:
stars with spectral types A to K and scaled solar metal abundances.
Section 2 describes the selected wavelength band,
§3 the working parameter space, and §4 provides details of
the implementation. Section 5 is devoted to connecting the 
parameters derived from spectroscopy to a widely accepted 
scale of effective temperatures based on the Infrared Flux 
Method, and checking the performance of the method. Sec- 
tion 6 concludes with a summary of present results and ideas 
for subsequent work. 



2 WAVELENGTH RANGE 

When using equivalent widths and classical flux-constant 
model atmospheres in abundance analyses, the relevant parameters
can usually be reduced to four: the stellar effective
temperature (Teff), the surface gravity (g), the microturbulence
(ξ), and the metal abundances, commonly considered
proportional to the iron abundance. For stars with
spectral types A-K, the most useful and accessible observa- 
tional probes for these parameters are the flux distribution 
and the spectral lines: excitation and ionization equilibrium 
of metals, the pressure-enhanced wings of strong metal lines, 
and the hydrogen lines. Different procedures to derive stellar 
parameters rely on one or several of these indicators. 

At this point we set aside potential difficulties in modelling
some of the spectral features mentioned above. We refer the 
reader to the papers by Dragon & Mutschlecner (1980), 
Castelli, Gratton & Kurucz (1997), or Bell, Balachandran 
& Bautista (2001) regarding the modelling of the continuum
flux; Thévenin & Idiart (1999) or Asplund et al. (2000)
on the calculation of Fe I line profiles; and Fuhrmann, 
Axer & Gehren (1996), Gardiner, Kupka & Smalley (1999), 
Barklem, Piskunov & O'Mara (2000), or Cowley & Castelli 
(2002) on modelling Balmer lines. With the ultimate goal 
of deriving chemical abundances, most spectroscopic obser- 
vations aim at providing reliable line profiles or equivalent 
widths for lines of weak-to-moderate strength. Ideally, one 
would use the same type of spectra to determine the stellar 
atmospheric parameters. In addition, accurate spectropho- 
tometry is challenging and highly vulnerable to reddening. 
Forcing the excitation and ionization equilibrium balance 
for metal lines is, in most cases, insufficient to reliably determine
the quartet (Teff, log g, [Fe/H]¹, ξ). Therefore, we
resort to a second feature, the Balmer lines. 

We selected a continuous spectral window around Hβ:
4810-4960 Å. This wavelength range represents a balance in
many aspects. It is red enough that the continuum opacity
is well described by H and H⁻ for the stars under consideration,
avoiding the difficulties of dealing with much more
complicated (metal) opacities, and it is blue enough that
the presence of spectral lines makes possible a reliable determination
of the metal abundance, even in metal-poor
stars.

1 $[{\rm Fe/H}] = \log\left[N({\rm Fe})/N({\rm H})\right] - \log\left[N({\rm Fe})/N({\rm H})\right]_{\odot}$, where N(E) represents the
number density of the element E.
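The bracket notation defined in footnote 1 can be illustrated with a short numerical sketch. The function name and the number densities below are hypothetical, chosen only to show that a star with one hundredth of the solar Fe-to-H ratio has [Fe/H] = −2.

```python
import math


def fe_h(n_fe_star, n_h_star, n_fe_sun, n_h_sun):
    """[Fe/H] = log10(N(Fe)/N(H))_star - log10(N(Fe)/N(H))_sun."""
    return math.log10(n_fe_star / n_h_star) - math.log10(n_fe_sun / n_h_sun)


# Made-up number densities (arbitrary units), for illustration only.
SOLAR_FE, SOLAR_H = 3.0e-5, 1.0

# A star whose Fe/H number ratio is 1/100 of the solar ratio.
print(round(fe_h(3.0e-7, 1.0, SOLAR_FE, SOLAR_H), 6))  # -2.0
```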



3 WORKING DOMAIN 

Use of equivalent widths, quantifying the strength of a spec- 
tral line by a single number, represents a loss of informa- 
tion. Use of line profiles introduces more variables in the 
analysis through the different line broadening mechanisms. 
Accounting for the broadening involves a number of difficul- 
ties in the practical implementation of a classification algo- 
rithm, although it also comes with extra information, e.g. 
projected rotational velocity, or instrument spectral resolu- 
tion. At this point we wish to restrict the classification to 
(Teff, log g, [Fe/H], ξ) and, therefore, we use a fixed resolving
power R ≡ λ/δλ ≃ 5000. We covered the selected spectral
range with 301 points, equally spaced in wavelength (every
0.5 Å). Our choice of resolution is a compromise: low enough
to make rotational and macro-turbulent broadening in late- 
type stars negligible, and high enough to be able to recover 
information on the stellar atmospheric parameters. Fig. 1 
shows observations for the Sun (solid; Kurucz et al. 1984), 
Procyon (dashed; Allende Prieto et al. 2002), and Arcturus 
(dash-dotted; Hinkle et al. 2000). The most relevant features 
have been identified in the figure. 
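The sampling described above can be sketched in a few lines: 301 points from 4810 to 4960 Å in steps of 0.5 Å, and a Gaussian FWHM of about 1 Å at R ≃ 5000. This is only an illustration under stated assumptions: the function names are hypothetical, the FWHM is evaluated at the window centre (4885 Å), and the edge handling of the smoothing is a choice made here, not something taken from the paper. In practice a very high dispersion spectrum would be smoothed on its own fine sampling before being rebinned to the 0.5 Å grid.

```python
import math

LAMBDA_MIN, LAMBDA_MAX, STEP = 4810.0, 4960.0, 0.5  # Angstrom, from the text
RESOLVING_POWER = 5000.0                             # R = lambda / delta-lambda


def wavelength_grid(start=LAMBDA_MIN, stop=LAMBDA_MAX, step=STEP):
    """Equally spaced wavelength vector covering [start, stop]."""
    n = int(round((stop - start) / step)) + 1
    return [start + i * step for i in range(n)]


def fwhm_at(wavelength, resolving_power=RESOLVING_POWER):
    """Width of a resolution element at the given wavelength."""
    return wavelength / resolving_power


def gaussian_kernel(fwhm, step=STEP, half_width_sigmas=4.0):
    """Normalized Gaussian kernel sampled every `step` Angstrom."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    half = int(math.ceil(half_width_sigmas * sigma / step))
    weights = [math.exp(-0.5 * (k * step / sigma) ** 2)
               for k in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(flux, kernel):
    """Convolve flux with kernel, renormalizing where the kernel is truncated."""
    half = len(kernel) // 2
    out = []
    for i in range(len(flux)):
        acc = norm = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(flux):
                acc += w * flux[j]
                norm += w
        out.append(acc / norm)
    return out


grid = wavelength_grid()
print(len(grid), grid[0], grid[-1])   # 301 4810.0 4960.0
print(round(fwhm_at(4885.0), 3))      # 0.977
```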

The range for each of the atmospheric parameters was 
selected to avoid some extreme conditions where classical 
model atmospheres in general, or those used here in particular,
are expected, or known, to fail: low temperatures at
which the contribution of molecules to the equation of state
is incomplete in the models; high temperatures at which departures
from LTE are important for the atmospheric structure;
or atmospheres so extended that the plane-parallel
approximation is inadequate. The selected domain is:
Arcturus is probably cooler than our lower limit for Teff
[Griffin & Lynas-Gray (1999) assign 4290 ± 30 K to this 
star], but it serves the purpose of showing an example of 
the coolest spectra in our working sample. As will become 
clear later, our raw Teff values are systematically higher than other
scales and thus a star with this effective temperature is tech- 
nically within the limits of our grid. 

Extensive testing showed that the flux in the selected
spectral window has a one-to-one relationship with the
parameters over most of the parameter space. In other words,
each combination of the four atmospheric parameters considered
produces a distinct flux in this window. Approaching the
extreme metal-poor limit, [Fe/H] ≲ −3, degeneracy is
unavoidable: as metal lines vanish, so does the information
on metallicity, gravity and microturbulence. The flux
in the selected window is not equally sensitive to changes in
the different stellar parameters. Changes in Teff affect the
spectrum the most, followed by variations in the metal abundance.
The effect of changes in the surface gravity is mainly
felt through the different sensitivity of lines of neutral and
ionized metals to pressure, and it is more subtle than the
response to Teff or [Fe/H].

Figure 1. Observed fluxes for Procyon, the Sun and Arcturus in the selected window. The original very high dispersion observations
have been convolved with a Gaussian profile with a FWHM of 1.0 Å. The observations in the McDonald atlas of Procyon (Allende Prieto
et al. 2002) have been continuum normalized by P. S. Barklem (see Barklem et al. 2002 for more details). The interval missing from the
spectrum of Arcturus is severely affected by telluric lines.



4 IMPLEMENTATION 

To find the set of parameters that best reproduces an ob- 
served spectrum, we need to choose an algorithm. We want 
to minimize (or maximize) a function of the stellar parameters.
This will require us to evaluate a function, and thus compute
synthetic spectra, a number of times. This number can
be very large, and therefore a strategy to reduce computing 
time is needed. We have tackled this problem by computing 
a discrete grid and interpolating. We adopted the following
increments for the four parameters involved: 500 K in
Teff, 1.0 dex in log g, 1.0 dex in [Fe/H], and 1.0 km s⁻¹ in ξ.
These were chosen to keep the changes in the spectrum small 
enough so that a fast multilinear interpolation would pro- 
vide a reasonable approximation. Interpolation errors can 
reach up to 2 % for the warmer stars in the grid, but up to 7 
% for the coolest metal-rich stars. These errors are smaller 
than the precision with which the real spectra can be repro- 
duced, and it was later verified that use of a finer grid does 
not improve the performance of the classification method. 
We used a genetic algorithm (GA) to solve our minimiza- 
tion problem. GAs are suitable for solving global optimiza- 
tion problems in complex landscapes where local extrema 
can confuse simpler algorithms (see Charbonneau 2002 for 
an informal introduction). 
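The multilinear interpolation between grid nodes mentioned above can be sketched as follows. This is a minimal illustration, not the FORTRAN routine used in this work: the function name, the dictionary layout of the grid, the node values, and the three-pixel toy "spectra" are all assumptions made for the example (the toy spectra are linear in each parameter, so the interpolation is exact for them).

```python
from itertools import product


def multilinear(axes, grid, point):
    """Multilinear interpolation of spectra stored on a regular grid.

    axes  : list of sorted node lists, one per parameter (>= 2 nodes each)
    grid  : dict mapping index tuples (i0, i1, ...) to flux lists
    point : parameter values, one per axis, inside the grid limits
    """
    lower, frac = [], []
    for nodes, x in zip(axes, point):
        if not nodes[0] <= x <= nodes[-1]:
            raise ValueError("point outside the grid")
        # Index of the cell containing x (last cell if x is on the upper edge).
        i = max(k for k in range(len(nodes) - 1) if nodes[k] <= x)
        lower.append(i)
        frac.append((x - nodes[i]) / (nodes[i + 1] - nodes[i]))
    n_pix = len(next(iter(grid.values())))
    result = [0.0] * n_pix
    # Weighted sum over the 2**N corners of the enclosing cell.
    for corner in product((0, 1), repeat=len(axes)):
        weight = 1.0
        for c, t in zip(corner, frac):
            weight *= t if c else 1.0 - t
        if weight == 0.0:
            continue
        spectrum = grid[tuple(i + c for i, c in zip(lower, corner))]
        for p in range(n_pix):
            result[p] += weight * spectrum[p]
    return result


# Hypothetical nodes; only the step sizes follow the text.
axes = [
    [4500.0, 5000.0, 5500.0],  # Teff (K), 500 K step
    [3.0, 4.0, 5.0],           # log g (dex), 1.0 dex step
    [-1.0, 0.0],               # [Fe/H] (dex), 1.0 dex step
    [1.0, 2.0],                # microturbulence (km/s), 1.0 km/s step
]


def toy_spectrum(teff, logg, feh, xi):
    """Stand-in for a synthetic spectrum: three 'pixels'."""
    return [teff / 1.0e4, logg + feh, xi]


grid = {idx: toy_spectrum(*(axes[d][i] for d, i in enumerate(idx)))
        for idx in product(*(range(len(a)) for a in axes))}

print(multilinear(axes, grid, (5250.0, 4.5, -0.5, 1.5)))
```

With four parameters each evaluation combines 16 neighbouring spectra, which is why keeping the whole grid in memory makes the interpolation cheap.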

The grid of synthetic spectra was based on non- 
overshooting Kurucz (1993) model atmospheres. These models
include mixing-length convection with α = 1.25, and
ξ = 2.0 km s⁻¹. The radiative transfer equation was solved
with the code Synspec (Hubeny & Lanz 2000), using very
simple continuous opacities: H, H⁻, Rayleigh and electron
scattering (as described in Hubeny 1988). An atomic line 
list was prepared with the data obtained from the Vienna 
Atomic Line Database (VALD; Kupka et al. 1999). This line 
list includes 7169 lines that are expected to contribute to the 
opacity in a solar-like atmosphere. The computed spectra 
were degraded to a resolving power of about 5000, sampled 
with a common wavelength vector, and normalized. 

The presence of a very broad line in the spectral window
(Hβ) makes normalization difficult. We have adopted
a straightforward unsupervised polynomial normalization. 
Although the results of this scheme would visually displease 
most stellar spectroscopists (see Fig. []), this simple proce- 
dure can be carried out quickly, and repeatability is easily 
achieved for a given set (Teff, log g, [Fe/H], ξ). In addition,
the fluxes were also divided by a constant, 1.8, to enforce flux
values between 0 and 1, a requirement of the GA software.

We implemented a FORTRAN routine to perform mul- 
tilinear interpolation in the four parameters under consid- 
eration. This routine is the interface between the grid of 
synthetic spectra and the genetic algorithm. The function 
we chose to maximize is: 



$$1 - \sum_i W_i\,(f_i - O_i)^2 \qquad (2)$$

where f is the vector of interpolated synthetic flux, O is the
vector of observations, and the index i indicates a particular
wavelength bin. We adopted



$$W_i = \left(f_i^{\odot} - O_i^{\odot}\right)^{-2} / 10^4,$$
as derived from the solar spectrum and a synthetic flux calculated
with solar parameters, but reset to
Figure 2. Comparison between the observed (O) and (linearly interpolated) synthetic (f) corrected fluxes for Procyon (F5 IV), the
Sun (G2 V), Arcturus (K1.5 III), and the metal-poor HD 2665 (G5 III; [Fe/H] ≈ −1.9). The difference of the two vectors is also plotted.
The many necessary multilinear interpolations are very
fast, as the grid of previously computed synthetic spectra 
is kept in memory. In fact, a non-negligible fraction of the 
time is invested in loading those data. We adopted a publicly
available GA software², due to D. L. Carroll (see, e.g.,
Carroll 1996). The default parameters were kept, namely, a 
micro-GA with uniform crossover. The GA was run for 500 
generations. This number was chosen from inspection of the 
convergence curve for a limited number of stars, but it was
later verified that increasing this value to 2000 generations 
did not produce any improvement in the performance. Clas- 
sification of a single spectrum takes about 3 seconds on a 
Sun Ultra5. Fig. 2 illustrates the agreement between the ob- 
served and the matched synthetic spectra for Procyon, the 
Sun, Arcturus, and the metal-poor star HD 2665 ([Fe/H]
≈ −1.9). The spectrum of HD 2665 was obtained from the
Elodie library (Prugniel & Soubiran 2001). 
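The search described in this section can be illustrated with a compact sketch that maximizes the figure of merit of Eq. (2). It is emphatically not Carroll's micro-GA: the version below is a simplified real-coded GA with binary tournaments, uniform crossover, random-reset mutation, and elitism, and every name, the toy model, and all settings (population size, mutation rate, seed) are assumptions made for illustration.

```python
import random


def fitness(synthetic, observed, weights):
    """Figure of merit 1 - sum_i W_i (f_i - O_i)^2; equals 1 for a perfect match."""
    return 1.0 - sum(w * (f - o) ** 2
                     for w, f, o in zip(weights, synthetic, observed))


def genetic_search(model, observed, weights, bounds, pop_size=20,
                   generations=200, mutation_rate=0.1, seed=0):
    """Maximize the fitness of model(params) within the given bounds."""
    rng = random.Random(seed)

    def score(ind):
        return fitness(model(ind), observed, weights)

    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    best = max(population, key=score)
    history = [score(best)]
    for _ in range(generations):
        children = [best[:]]  # elitism: the best individual always survives
        while len(children) < pop_size:
            # Binary tournaments pick the two parents.
            p1 = max(rng.sample(population, 2), key=score)
            p2 = max(rng.sample(population, 2), key=score)
            # Uniform crossover: each gene comes from either parent.
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            # Mutation: redraw a gene uniformly inside its bounds.
            for g, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    child[g] = rng.uniform(lo, hi)
            children.append(child)
        population = children
        best = max(population, key=score)
        history.append(score(best))
    return best, history


def toy_model(params):
    """Stand-in for the interpolated synthetic spectrum (4 'pixels')."""
    a, b = params
    return [a, b, a * b, a + b]


observed = toy_model([0.3, 0.7])
weights = [1.0] * len(observed)
best, history = genetic_search(toy_model, observed, weights,
                               [(0.0, 1.0), (0.0, 1.0)])
print(best, history[-1])
```

In a real application `model` would call the multilinear interpolation over the grid of synthetic spectra, and `bounds` would be the limits of the working domain. Because of the elitism step the best fitness never decreases from one generation to the next, which makes the convergence curve easy to inspect.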



5 CONNECTING OUR STELLAR PARAMETERS TO THE IRFM Teff SCALE

Detailed modelling of Hβ is an issue. Even though the wings
of Balmer lines are considered to form very close to LTE 
conditions, that does not apply to their cores. A tougher 
complication is the fact that Balmer lines are commonly af- 
fected by the temperature distribution in the deepest atmo- 
spheric layers, which for late-type stars are significantly in- 
fluenced by convection. Convection is typically treated using 
a mixing-length formalism which implies a choice for one or 
several parameters. This represents an important additional 
source of uncertainty in the derived parameters - very likely 
a systematic bias in the derived Teff. Such a bias in Teff,
because of the tight coupling between Teff, g, and [Fe/H],
will also produce a bias in the other two parameters. In our
view, the best possible option is to anchor our spectroscopic
Teff scale to a more reliable scale. The systematic effect in
g and [Fe/H] implied by the necessary correction to our derived
Teff can be easily predicted. Empirical studies have
shown that a shift in an adopted Teff will, through the iron
excitation-ionization balance, translate to a shift in log g:



2 Available from http://cuaerospace.com/carroll/ga.html

$$\Delta \log g \simeq \Delta T_{\rm eff} / 466 \qquad (5)$$

(Allende Prieto 1998), and a correction to [Fe/H]:
Figure 3. The raw effective temperatures derived from fitting the spectral region 4810-4960 Å compared to those derived from the
(B − V) calibrations of Alonso et al. (1996, 1999). The solid line is a second-order least-squares polynomial fit that we use to correct our
Teff scale.
(e.g. Gray 1992; Allende Prieto et al. 1999).
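As a worked example of Eq. (5), the sketch below converts a temperature shift into the implied gravity shift. The function name is hypothetical, and it assumes the temperature shift is expressed in kelvin and the gravity shift in dex. The companion relation for [Fe/H] is not reproduced here.

```python
def delta_logg(delta_teff_k):
    """Shift in log g (dex) implied by a shift in Teff (K), following Eq. (5)."""
    return delta_teff_k / 466.0


# Lowering Teff by 100 K lowers the derived log g by about 0.2 dex.
print(round(delta_logg(-100.0), 3))  # -0.215
```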

As we will be applying this correction to our Teff scale,
and the sensitivity to Teff of the selected spectral region relies
largely on the Hβ profile, there is no need to place great
emphasis on the accuracy of the calculated absorption
profile for Hβ. We can make use of some approximations, and
take advantage of a reduced computing time. In particular,
we use an approximate broadening treatment described in
Appendix B of Hubeny, Hummer & Lanz (1994). As this
approach underestimates the width of the solar Hβ profile,
we apply a zeroth-order correction, using twice the default
value. This modification is not strictly necessary, as the correction
to be applied later to the Teff scale would be able to
fix this zero point as well.

We decided to anchor our spectroscopic Teff scale to that
of Alonso, Arribas & Martínez-Roger (1996, 1999), which
is based on the Infrared Flux Method, as modelled with 
fluxes from Kurucz (1993) model atmospheres. A library 
of high-resolution spectra recently published by Prugniel & 
Soubiran (2001) is used as a test bed for our classification
scheme. This library, hereafter the Elodie library, consists 
of more than 908 spectra from 709 stars spanning a large 
fraction of, and in some instances exceeding, our parameter 
space. We determine the Teff for each star using the Alonso
et al. calibrations for (B − V). These calibrations use (B − V)
and [Fe/H], and both parameters were adopted from those 
given in the Elodie library. The division of the stars in IV- 
V or I-III classes, in order to select the appropriate IRFM 
calibration, is based on the gravities provided in the Elodie 
library, setting the division line at log g = 3.8.

The spectra in the library with a resolution of R =
10,000 are convolved with a Gaussian profile with a FWHM
of 1.0 Å, and then fed to our classification system to find a
best-fitting vector (Teff, log g, [Fe/H], ξ). Then, the raw values
derived for Teff are compared against those from the IRFM
calibrations. The two scales are compared in Fig. 3. A
second-order least-squares polynomial fit is adopted to anchor
the derived raw Teff to the IRFM scale:



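[Equation not fully recoverable: a second-order polynomial relating the raw Teff to the IRFM Teff, of which only the coefficients 0.6686 (term linear in the IRFM Teff) and 0.0001159 (quadratic term) survive.]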
and this correction is also translated to log g and
[Fe/H], based on the correlation expected for the excitation-
ionization balance (see Eqs. 5 and 6). Fig. 3 reveals a tendency
for some stars to clump at certain values of Teff, in
particular at ≈ 5375 K. Too aggressive a choice for the steps
in the grid parameters (with the implied errors in the linear
interpolation) is not to blame for this systematic effect,
which is related to caveats involved in the implementation
of GAs. Notably, the scatter is not symmetrically distributed
about the adopted mean relationship, defined by
the polynomial fit. This is what we expect when
reddening is not negligible, introducing significant errors in
the photometric Teff values [which are based on (B − V)
colors not corrected for reddening]. Addressing these and other issues exceeds the
scope of this exploratory study.
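A second-order least-squares fit of the kind used to tie the two temperature scales together can be sketched as below. Several things here are assumptions for illustration only: the temperatures are made-up numbers, not the Elodie data; the polynomial is fitted with the raw Teff as the independent variable so that the correction is a direct evaluation (the fit adopted in this work may be expressed the other way round); and the temperatures are centred and scaled before fitting to keep the 3×3 normal equations well conditioned.

```python
def fit_quadratic(x, y):
    """Least-squares coefficients (c0, c1, c2) of y ~ c0 + c1*x + c2*x**2."""
    # Normal equations A c = b with A[j][k] = sum x^(j+k), b[j] = sum y x^j.
    s = [sum(xi ** p for xi in x) for p in range(5)]
    a = [[s[j + k] for k in range(3)] for j in range(3)]
    b = [sum(yi * xi ** j for xi, yi in zip(x, y)) for j in range(3)]
    # Gaussian elimination with partial pivoting on the augmented matrix.
    m = [row + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    coeffs = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        acc = m[r][3] - sum(m[r][c] * coeffs[c] for c in range(r + 1, 3))
        coeffs[r] = acc / m[r][r]
    return tuple(coeffs)


def to_unit(teff_k):
    """Centre and scale a temperature so that powers of it stay of order one."""
    return (teff_k - 5500.0) / 1000.0


# Made-up pairs of (raw spectroscopic Teff, IRFM Teff) in K.
raw = [4600.0, 5100.0, 5500.0, 6000.0, 6400.0, 6900.0]
irfm = [4500.0, 4950.0, 5350.0, 5800.0, 6250.0, 6700.0]

c0, c1, c2 = fit_quadratic([to_unit(t) for t in raw], irfm)


def correct(teff_raw):
    """Map a raw Teff onto the reference scale with the fitted polynomial."""
    u = to_unit(teff_raw)
    return c0 + c1 * u + c2 * u * u


print(round(correct(5800.0)))
```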

The corrected Teff, log g, and [Fe/H] values can be directly
compared to those given in the Elodie library. The
mean difference and the σ_rms (607 spectra) are

Teff: 37 ± 150 K
log g: 0.16 ± 0.52 dex (8)
[Fe/H]: 0.02 ± 0.18 dex



and Fig. 4 compares the two sets of parameters. 

A contribution to the error bars is connected with the 
parameters adopted for the Elodie library, mainly compiled 
from the literature. The library also provides estimates of 
the reliability of the adopted values. Restricting the comparison
to the spectra with the most trusted parameters³,
we find (71 spectra)

3 Using the library's code: q(Teff)=4, q(log g)=1, q([Fe/H])=4
Figure 4. Comparison between the stellar parameters selected (mainly from the literature) for the Elodie stars (Prugniel & Soubiran
1999 and references therein) with those from spectral fitting derived in this work.



Teff: 21 ± 102 K
log g: 0.07 ± 0.37 dex (9)
[Fe/H]: 0.02 ± 0.10 dex.

The σ_rms for the same stars between the IRFM Teff values
and those we derived is 80 K. The agreement for Teff and
[Fe/H] is satisfying, but it is not so for log g. As explained
in §3, the sensitivity of the spectrum to this parameter is
not nearly as high as to the others. We find that 36 stars
from the last set are included in the determination of stellar
parameters by Allende Prieto & Lambert (1999) based on
the comparison of observed colors and parallaxes with evolutionary
models. The mean and median uncertainties for
the reference gravities are both 0.08 dex. For these stars, we
find a more satisfactory mean difference and σ_rms in log g:
0.05 ± 0.28 dex, as shown in Fig. 5, which leads us to believe
that 0.3 dex is a reasonable estimate for our random errors
in gravity.
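The comparison statistics quoted in this section (a mean difference and an rms scatter between two sets of parameters) can be computed as in the sketch below. The function name and the four temperatures are made up for the example, and the use of the sample standard deviation about the mean as the "rms" is an assumption about the exact estimator.

```python
from statistics import mean, stdev


def compare(ours, reference):
    """Mean and scatter of the differences (ours - reference)."""
    diffs = [a - b for a, b in zip(ours, reference)]
    return mean(diffs), stdev(diffs)


# Made-up effective temperatures (K) for four stars.
ours = [5800.0, 6100.0, 4900.0, 5500.0]
reference = [5750.0, 6150.0, 4880.0, 5480.0]

offset, scatter = compare(ours, reference)
print(f"{offset:.0f} +/- {scatter:.0f} K")  # 10 +/- 42 K
```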



6 CONCLUSIONS AND FUTURE APPLICATIONS

We have implemented a spectroscopic classification algorithm
that provides estimates for Teff, log g, [Fe/H], and ξ
based on observations with a resolving power R ≃ 5000
in the spectral range 4810-4960 Å. The classification system
is based on synthetic spectra calculated with classical
flux-constant model atmospheres, and it is anchored to the 
photometric calibrations of effective temperature derived by 
Alonso and collaborators (Alonso et al. 1996, 1999) based 
on the Infrared Flux Method. By using the Elodie spectro- 
scopic library (Prugniel & Soubiran 2001) and the gravities 
determined by Allende Prieto & Lambert (1999) for nearby 
Hipparcos stars, we derive upper limits to the uncertainties 
in the retrieved Teff, log g, and [Fe/H], as 100 K, 0.3 dex,
and 0.1 dex, respectively. 

Figure 5. Comparison between the gravities determined from Hipparcos' parallaxes by Allende Prieto & Lambert (1999) with those
from spectral fitting derived in this work.

Our classification algorithm can be used on spectral
types A to K, for main-sequence and evolved stars with
gravities as low as log g = 2, and all metallicities. The system
is able to classify a stellar spectrum in only 3 seconds
on a modern workstation. Work is in progress to improve the
accuracy of the synthetic spectra by using a more realistic 
line absorption profile for Hβ. We also plan on varying the
parameters that affect the performance of the employed GA, 
and testing different optimization algorithms. An important 
question to answer is how the performance of our system 
improves or degrades with spectral resolution and signal- 
to-noise ratio. Different spectral ranges should be explored. 
Spectral bands with lines of metals whose abundances do not 
scale well with iron should be avoided, unless more param- 
eters are included in the search, which is certainly feasible. 

Different classification algorithms for stellar spectra 
have been tested in the literature. In particular, artificial 
neural networks hold the promise for the highest speeds, 
which may be critical for problems involving a large number 
of free parameters (see, e.g., Bailer-Jones 2001; Snider et al.
2001). When the number of parameters is limited, like in
the spectroscopic classification considered here, GAs have 
the advantage of not requiring training. This, in turn, al- 
lows us to explore different strategies, such as the selection 
of the spectral range or the resolving power, very quickly, 
while keeping the search global. 

Our final goal is to provide a web interface and make 
this or a similar system publicly available. Future work will 
target hotter spectral types, using calculations based upon 
non-LTE model atmospheres. Extension to the bottom of 
the main-sequence and beyond could follow, although the- 
oretical modelling of cool atmospheres has not yet reached 
the same level of maturity as for warmer stars. The adopted 
strategy can easily accommodate future improvements in 
model atmospheres and spectral synthesis. 